Section 17



Metrics for convergence of laws. 
Empirical measures. 

Levy-Prohorov metric. Consider a metric space $(S,d)$. For a set $A \subseteq S$ let us denote by

$$A^{\varepsilon} = \{y \in S : d(x,y) < \varepsilon \text{ for some } x \in A\}$$

its $\varepsilon$-neighborhood. Let $\mathcal{B}$ be the Borel $\sigma$-algebra on $S$.

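Definition. If $\mathbb{P}$, $\mathbb{Q}$ are probability distributions on $\mathcal{B}$ then

$$\rho(\mathbb{P},\mathbb{Q}) = \inf\{\varepsilon > 0 : \mathbb{P}(A) \le \mathbb{Q}(A^{\varepsilon}) + \varepsilon \text{ for all } A \in \mathcal{B}\}$$

is called the Levy-Prohorov distance between $\mathbb{P}$ and $\mathbb{Q}$.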
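
To make the definition concrete, the condition inside the infimum can be checked by brute force when both laws have finite support. The following is a minimal sketch in Python, assuming laws on the real line with $d(x,y) = |x-y|$, represented as dictionaries from points to masses; the function name `lp_condition_holds` and this representation are illustrative choices, not part of the text.

```python
from itertools import combinations

def lp_condition_holds(P, Q, eps):
    """Check P(A) <= Q(A^eps) + eps for every nonempty subset A of supp(P).

    P and Q are dicts mapping points of the real line to their masses.
    Restricting to subsets of supp(P) loses nothing: replacing A by its
    intersection with supp(P) keeps P(A) and can only shrink A^eps.
    """
    support = list(P)
    for r in range(1, len(support) + 1):
        for A in combinations(support, r):
            p_mass = sum(P[a] for a in A)
            # Q-mass of the open eps-neighborhood of A
            q_mass = sum(q for y, q in Q.items()
                         if any(abs(y - a) < eps for a in A))
            # small tolerance guards against floating-point round-off
            if p_mass > q_mass + eps + 1e-12:
                return False
    return True
```

For two point masses at 0 and 0.3, `lp_condition_holds({0.0: 1.0}, {0.3: 1.0}, eps)` is true for `eps = 0.31` and false for `eps = 0.29`, consistent with the Levy-Prohorov distance between point masses at distance $d$ being $\min(d, 1)$.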

Lemma 34 $\rho$ is a metric on the set of probability laws on $\mathcal{B}$.

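Proof. 1. First, let us show that $\rho(\mathbb{Q},\mathbb{P}) = \rho(\mathbb{P},\mathbb{Q})$. Suppose that $\rho(\mathbb{P},\mathbb{Q}) > \varepsilon$. Then there exists a set $A$ such that $\mathbb{P}(A) > \mathbb{Q}(A^{\varepsilon}) + \varepsilon$. Taking complements gives

$$\mathbb{Q}(A^{\varepsilon c}) > \mathbb{P}(A^{c}) + \varepsilon \ge \mathbb{P}(A^{\varepsilon c \varepsilon}) + \varepsilon,$$

where $A^{\varepsilon c} = (A^{\varepsilon})^{c}$ and $A^{\varepsilon c \varepsilon} = (A^{\varepsilon c})^{\varepsilon}$, and the last inequality follows from the fact that $A^{c} \supseteq A^{\varepsilon c \varepsilon}$:

$$\begin{aligned}
a \in A^{\varepsilon c \varepsilon} &\Rightarrow d(a, A^{\varepsilon c}) < \varepsilon \Rightarrow d(a,b) < \varepsilon \text{ for some } b \in A^{\varepsilon c} \\
&\qquad \{\text{since } b \notin A^{\varepsilon},\ d(b,A) \ge \varepsilon\} \\
&\Rightarrow d(a,A) > 0 \Rightarrow a \notin A \Rightarrow a \in A^{c}.
\end{aligned}$$

Therefore, for the set $B = A^{\varepsilon c}$, $\mathbb{Q}(B) > \mathbb{P}(B^{\varepsilon}) + \varepsilon$. This means that $\rho(\mathbb{Q},\mathbb{P}) \ge \varepsilon$ and, therefore, $\rho(\mathbb{Q},\mathbb{P}) \ge \rho(\mathbb{P},\mathbb{Q})$. By symmetry, $\rho(\mathbb{Q},\mathbb{P}) \le \rho(\mathbb{P},\mathbb{Q})$ and $\rho(\mathbb{Q},\mathbb{P}) = \rho(\mathbb{P},\mathbb{Q})$.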

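2. Next, let us show that if $\rho(\mathbb{P},\mathbb{Q}) = 0$ then $\mathbb{P} = \mathbb{Q}$. For any set $F$ and any $n \ge 1$,

$$\mathbb{P}(F) \le \mathbb{Q}(F^{1/n}) + \frac{1}{n}.$$

If $F$ is closed then $F^{1/n} \downarrow F$ as $n \to \infty$ and by continuity of measure

$$\mathbb{P}(F) \le \mathbb{Q}\Bigl(\bigcap_{n \ge 1} F^{1/n}\Bigr) = \mathbb{Q}(F).$$

Similarly, $\mathbb{P}(F) \ge \mathbb{Q}(F)$ and, therefore, $\mathbb{P}(F) = \mathbb{Q}(F)$. Since a law on $\mathcal{B}$ is determined by its values on closed sets, $\mathbb{P} = \mathbb{Q}$.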
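3. Finally, let us prove the triangle inequality

$$\rho(\mathbb{P},\mathbb{R}) \le \rho(\mathbb{P},\mathbb{Q}) + \rho(\mathbb{Q},\mathbb{R}).$$

If $\rho(\mathbb{P},\mathbb{Q}) < x$ and $\rho(\mathbb{Q},\mathbb{R}) < y$ then for any set $A$,

$$\mathbb{P}(A) \le \mathbb{Q}(A^{x}) + x \le \mathbb{R}\bigl((A^{x})^{y}\bigr) + y + x \le \mathbb{R}(A^{x+y}) + x + y,$$

which means that $\rho(\mathbb{P},\mathbb{R}) \le x + y$.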

□ 

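Bounded Lipschitz metric. Given probability distributions $\mathbb{P}$, $\mathbb{Q}$ on the metric space $(S,d)$ we define a bounded Lipschitz distance between them by

$$\beta(\mathbb{P},\mathbb{Q}) = \sup\Bigl\{ \Bigl| \int f \, d\mathbb{P} - \int f \, d\mathbb{Q} \Bigr| : \|f\|_{BL} \le 1 \Bigr\}.$$

Lemma 35 $\beta$ is a metric on the set of probability laws on $\mathcal{B}$.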

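Proof. $\beta(\mathbb{P},\mathbb{Q}) = \beta(\mathbb{Q},\mathbb{P})$ and the triangle inequality are obvious. It remains to prove that $\beta(\mathbb{P},\mathbb{Q}) = 0$ implies $\mathbb{P} = \mathbb{Q}$. Given a closed set $F$, the sequence of functions $f_m(x) = m\,d(x,F) \wedge 1$ increases to $I_U$, where $U = F^{c}$. Obviously, $\|f_m\|_{BL} \le m + 1$ and, therefore, applying $\beta(\mathbb{P},\mathbb{Q}) = 0$ to $f_m/(m+1)$ gives $\int f_m \, d\mathbb{P} = \int f_m \, d\mathbb{Q}$. Letting $m \to \infty$ proves, by monotone convergence, that $\mathbb{P}(U) = \mathbb{Q}(U)$. Since $F$ was an arbitrary closed set, the two laws agree on all open sets and hence $\mathbb{P} = \mathbb{Q}$.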

□ 
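
The monotone approximation used in this proof is easy to see numerically. The following is a small Python sketch, assuming $S$ is the real line and $F = [a, b]$ is a closed interval (both assumptions made only for illustration; the helper names are hypothetical).

```python
def dist_to_interval(x, a, b):
    """Distance from x to the closed interval F = [a, b]."""
    return max(a - x, 0.0, x - b)

def f_m(x, m, a=0.0, b=1.0):
    """f_m(x) = min(m * d(x, F), 1) for F = [a, b]."""
    return min(m * dist_to_interval(x, a, b), 1.0)

# On F the functions vanish; off F they increase in m towards 1,
# i.e. towards the indicator of the open set U = complement of F.
inside = [f_m(0.5, m) for m in (1, 10, 100, 1000)]
outside = [f_m(1.01, m) for m in (1, 10, 100, 1000)]
```

Here `inside` stays at zero while `outside` is nondecreasing in `m` and reaches 1 once `m * 0.01 >= 1`. The Lipschitz constant of `f_m` is at most `m`, which is where the bound $\|f_m\|_{BL} \le m + 1$ comes from.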

The law $\mathbb{P}$ on $(S,d)$ is tight if for any $\varepsilon > 0$ there exists a compact $K \subseteq S$ such that $\mathbb{P}(S \setminus K) \le \varepsilon$.

Theorem 40 (Ulam) If $(S,d)$ is separable then for any law $\mathbb{P}$ on $\mathcal{B}$ and any $\varepsilon > 0$ there exists a closed totally bounded set $K \subseteq S$ such that $\mathbb{P}(S \setminus K) \le \varepsilon$. If $(S,d)$ is complete and separable then $K$ is compact and, therefore, every law is tight.

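Proof. Consider a sequence $\{x_1, x_2, \ldots\}$ that is dense in $S$. For any $m \ge 1$, $S = \bigcup_{i=1}^{\infty} \bar{B}(x_i, \frac{1}{m})$, where $\bar{B}$ denotes a closed ball, and by continuity of measure, for large enough $n(m)$,

$$\mathbb{P}\Bigl(S \setminus \bigcup_{i=1}^{n(m)} \bar{B}\bigl(x_i, \tfrac{1}{m}\bigr)\Bigr) \le \frac{\varepsilon}{2^{m}}.$$

If we take

$$K = \bigcap_{m \ge 1} \bigcup_{i=1}^{n(m)} \bar{B}\bigl(x_i, \tfrac{1}{m}\bigr)$$

then

$$\mathbb{P}(S \setminus K) \le \sum_{m \ge 1} \frac{\varepsilon}{2^{m}} = \varepsilon.$$

$K$ is closed and totally bounded by construction: it is an intersection of finite unions of closed balls, and for every $m$ it is covered by finitely many balls of radius $\frac{1}{m}$. If $S$ is complete, $K$ is compact.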

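Theorem 41 Suppose that either $(S,d)$ is separable or $\mathbb{P}$ is tight. Then the following are equivalent.

1. $\mathbb{P}_n \to \mathbb{P}$ weakly.

2. For all $f \in BL(S,d)$, $\int f \, d\mathbb{P}_n \to \int f \, d\mathbb{P}$.

3. $\beta(\mathbb{P}_n,\mathbb{P}) \to 0$.

4. $\rho(\mathbb{P}_n,\mathbb{P}) \to 0$.
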
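Proof. 1$\Rightarrow$2. Obvious.

3$\Rightarrow$4. In fact, we will prove that

$$\rho(\mathbb{P}_n,\mathbb{P}) \le 2\sqrt{\beta(\mathbb{P}_n,\mathbb{P})}. \tag{17.0.1}$$

Given a Borel set $A \subseteq S$, consider a function

$$f(x) = 0 \vee \Bigl(1 - \frac{1}{\varepsilon}\, d(x,A)\Bigr) \quad \text{such that} \quad I_A \le f \le I_{A^{\varepsilon}}.$$

Obviously, $\|f\|_{BL} \le 1 + \varepsilon^{-1}$ and we can write

$$\begin{aligned}
\mathbb{P}_n(A) \le \int f \, d\mathbb{P}_n &= \int f \, d\mathbb{P} + \Bigl(\int f \, d\mathbb{P}_n - \int f \, d\mathbb{P}\Bigr) \\
&\le \mathbb{P}(A^{\varepsilon}) + (1 + \varepsilon^{-1}) \sup\Bigl\{ \Bigl|\int f \, d\mathbb{P}_n - \int f \, d\mathbb{P}\Bigr| : \|f\|_{BL} \le 1 \Bigr\} \\
&= \mathbb{P}(A^{\varepsilon}) + (1 + \varepsilon^{-1})\, \beta(\mathbb{P}_n,\mathbb{P}) \le \mathbb{P}(A^{\delta}) + \delta,
\end{aligned}$$

where $\delta = \max\bigl(\varepsilon, (1 + \varepsilon^{-1})\beta(\mathbb{P}_n,\mathbb{P})\bigr)$. This implies that $\rho(\mathbb{P}_n,\mathbb{P}) \le \delta$. Since $\varepsilon$ is arbitrary we can minimize $\delta = \delta(\varepsilon)$ over $\varepsilon$. Writing $\beta = \beta(\mathbb{P}_n,\mathbb{P})$ and $\rho = \rho(\mathbb{P}_n,\mathbb{P})$, if we take $\varepsilon = \sqrt{\beta}$ then $\delta = \max(\sqrt{\beta}, \beta + \sqrt{\beta}) = \beta + \sqrt{\beta}$ and

$$\beta \le 1 \Rightarrow \rho \le 2\sqrt{\beta}; \qquad \beta \ge 1 \Rightarrow \rho \le 1 \le 2\sqrt{\beta}.$$
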
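
As a sanity check of the bound $\rho \le 2\sqrt{\beta}$, one can compare the two distances for a pair of point masses $\delta_x$, $\delta_y$ with $d(x,y) = d$. The closed forms used below are a side calculation rather than something stated in the text. Testing the definition of $\rho$ on $A = \{x\}$ gives $\rho = \min(d, 1)$. Assuming the convention $\|f\|_{BL} = \|f\|_\infty + \|f\|_L$, the bound $|f(x) - f(y)| \le \min(2\|f\|_\infty, \|f\|_L\, d)$ is optimized at $\|f\|_\infty = d/(2+d)$ and gives $\beta = 2d/(2+d)$. A minimal Python sketch with hypothetical helper names:

```python
import math

def rho_dirac(d):
    """Levy-Prohorov distance between two point masses at distance d."""
    return min(d, 1.0)

def beta_dirac(d):
    """Bounded Lipschitz distance between two point masses at distance d,
    assuming the convention ||f||_BL = ||f||_inf + ||f||_L."""
    return 2.0 * d / (2.0 + d)

# Check rho <= 2 * sqrt(beta) over a range of separations d.
checks = []
for d in [0.01, 0.1, 0.5, 1.0, 2.0, 10.0]:
    rho, beta = rho_dirac(d), beta_dirac(d)
    checks.append(rho <= 2.0 * math.sqrt(beta))
```

Every entry of `checks` is true, as the inequality requires. For small $d$ the two sides differ considerably ($\rho \approx d$ while $2\sqrt{\beta} \approx 2\sqrt{d}$), so for this family the bound is far from tight.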
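4$\Rightarrow$1. Suppose that $\rho(\mathbb{P}_n,\mathbb{P}) \to 0$ which means that there exists a sequence $\varepsilon_n \downarrow 0$ such that

$$\mathbb{P}_n(A) \le \mathbb{P}(A^{\varepsilon_n}) + \varepsilon_n \quad \text{for all measurable } A \subseteq S.$$

If $A$ is closed, then $\bigcap_{n \ge 1} A^{\varepsilon_n} = A$ and, by continuity of measure,

$$\limsup_{n \to \infty} \mathbb{P}_n(A) \le \limsup_{n \to \infty} \bigl(\mathbb{P}(A^{\varepsilon_n}) + \varepsilon_n\bigr) = \mathbb{P}(A).$$

By the portmanteau theorem, $\mathbb{P}_n \to \mathbb{P}$.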

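2$\Rightarrow$3. If $\mathbb{P}$ is tight, let $K$ be a compact such that $\mathbb{P}(S \setminus K) \le \varepsilon$. If $(S,d)$ is separable, by Ulam's theorem, let $K$ be a closed totally bounded set such that $\mathbb{P}(S \setminus K) \le \varepsilon$. If we consider a function

$$f(x) = 0 \vee \Bigl(1 - \frac{1}{\varepsilon}\, d(x,K)\Bigr) \quad \text{with} \quad \|f\|_{BL} \le 1 + \frac{1}{\varepsilon}$$

then

$$\mathbb{P}_n(K^{\varepsilon}) \ge \int f \, d\mathbb{P}_n \to \int f \, d\mathbb{P} \ge \mathbb{P}(K) \ge 1 - \varepsilon,$$

which implies that for $n$ large enough, $\mathbb{P}_n(K^{\varepsilon}) \ge 1 - 2\varepsilon$. This means that all $\mathbb{P}_n$ are essentially concentrated on $K^{\varepsilon}$. Let

$$B = \{f : \|f\|_{BL(S,d)} \le 1\}, \qquad B_K = \{f|_K : f \in B\} \subseteq C(K),$$

where $f|_K$ denotes the restriction of $f$ to $K$. If $K$ is compact then, by the Arzela-Ascoli theorem, $B_K$ is totally bounded with respect to $d_\infty$. If $K$ is totally bounded then we can isometrically identify functions in $B_K$ with their unique extensions to the completion $K'$ of $K$ and, by the Arzela-Ascoli theorem for the compact $K'$, $B_K$ is again totally bounded with respect to $d_\infty$. In any case, given $\varepsilon > 0$, we can find $f_1, \ldots, f_k \in B$ such that for all $f \in B$

$$\sup_{x \in K} |f(x) - f_j(x)| \le \varepsilon \quad \text{for some } j \le k.$$

This uniform approximation can also be extended to $K^{\varepsilon}$. Namely, for any $x \in K^{\varepsilon}$ take $y \in K$ such that $d(x,y) \le \varepsilon$. Then

$$\begin{aligned}
|f(x) - f_j(x)| &\le |f(x) - f(y)| + |f(y) - f_j(y)| + |f_j(y) - f_j(x)| \\
&\le \|f\|_{L}\, d(x,y) + \varepsilon + \|f_j\|_{L}\, d(x,y) \le 3\varepsilon.
\end{aligned}$$
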
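Therefore, for any $f \in B$ and $n$ large enough,

$$\begin{aligned}
\Bigl|\int f \, d\mathbb{P}_n - \int f \, d\mathbb{P}\Bigr|
&\le \Bigl|\int_{K^{\varepsilon}} f \, d\mathbb{P}_n - \int_{K^{\varepsilon}} f \, d\mathbb{P}\Bigr| + \|f\|_\infty \bigl(\mathbb{P}_n(K^{\varepsilon c}) + \mathbb{P}(K^{\varepsilon c})\bigr) \\
&\le \Bigl|\int_{K^{\varepsilon}} f \, d\mathbb{P}_n - \int_{K^{\varepsilon}} f \, d\mathbb{P}\Bigr| + 2\varepsilon + \varepsilon \\
&\le \Bigl|\int_{K^{\varepsilon}} f_j \, d\mathbb{P}_n - \int_{K^{\varepsilon}} f_j \, d\mathbb{P}\Bigr| + 3\varepsilon + 3\varepsilon + 2\varepsilon + \varepsilon \\
&\le \Bigl|\int f_j \, d\mathbb{P}_n - \int f_j \, d\mathbb{P}\Bigr| + 3\varepsilon + 3\varepsilon + 3\varepsilon + 2\varepsilon + \varepsilon \\
&\le \max_{1 \le j \le k} \Bigl|\int f_j \, d\mathbb{P}_n - \int f_j \, d\mathbb{P}\Bigr| + 12\varepsilon.
\end{aligned}$$

Finally,

$$\beta(\mathbb{P}_n,\mathbb{P}) = \sup_{f \in B} \Bigl|\int f \, d\mathbb{P}_n - \int f \, d\mathbb{P}\Bigr| \le \max_{1 \le j \le k} \Bigl|\int f_j \, d\mathbb{P}_n - \int f_j \, d\mathbb{P}\Bigr| + 12\varepsilon$$

and, using assumption 2, $\limsup_{n \to \infty} \beta(\mathbb{P}_n,\mathbb{P}) \le 12\varepsilon$. Letting $\varepsilon \to 0$ finishes the proof.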

□ 

Convergence of empirical measures. Let $(\Omega, \mathbb{P})$ be a probability space and $X_1, X_2, \ldots : \Omega \to S$ be an i.i.d. sequence of random variables with values in a metric space $(S,d)$. Let $\mu$ be the law of $X_i$ on $S$. Let us define the random empirical measures $\mu_n$ on the Borel $\sigma$-algebra $\mathcal{B}$ on $S$ by

$$\mu_n(A)(\omega) = \frac{1}{n} \sum_{i=1}^{n} I(X_i(\omega) \in A), \quad A \in \mathcal{B}.$$

By the strong law of large numbers, for any $f \in C_b(S)$,

$$\int f \, d\mu_n = \frac{1}{n} \sum_{i=1}^{n} f(X_i) \to \mathbb{E} f(X_1) = \int f \, d\mu \quad \text{a.s.}$$

However, the set of measure zero where this convergence is violated depends on $f$ and it is not obvious that the convergence holds for all $f \in C_b(S)$ with probability one.
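
For a single fixed test function the convergence is easy to observe in simulation. The following is a minimal Python sketch, assuming $S = [0,1]$, $\mu$ the uniform law, and the test function $f(x) = \min(x, 1/2)$; these choices, the seed, and the helper name `empirical_integral` are illustrative assumptions.

```python
import random

def empirical_integral(f, sample):
    """Integral of f against the empirical measure of the sample,
    i.e. the average (1/n) * sum of f(X_i)."""
    return sum(f(x) for x in sample) / len(sample)

def f(x):
    # A bounded continuous (indeed Lipschitz) test function on [0, 1].
    return min(x, 0.5)

rng = random.Random(0)  # fixed seed, purely for reproducibility
sample = [rng.random() for _ in range(100_000)]  # X_i i.i.d. uniform on [0, 1]

exact = 0.375  # E f(X_1) = int_0^{1/2} x dx + (1/2) * (1/2)
approx = empirical_integral(f, sample)
error = abs(approx - exact)
```

With $10^5$ samples the standard error of the average is of order $10^{-3}$, so `error` should be small. This only illustrates the statement for one $f$; it says nothing about simultaneous convergence over all of $C_b(S)$, which is the point of the theorem that follows.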

Theorem 42 (Varadarajan) Let $(S,d)$ be a separable metric space. Then $\mu_n$ converges to $\mu$ weakly almost surely,

$$\mathbb{P}\bigl(\omega : \mu_n(\cdot)(\omega) \to \mu \text{ weakly}\bigr) = 1.$$

Proof. Since $(S,d)$ is separable, by Theorem 2.8.2 in R.A.P., there exists a metric $e$ on $S$ such that $(S,e)$ is totally bounded and $e$ and $d$ define the same topology, i.e. $e(s_n, s) \to 0$ if and only if $d(s_n, s) \to 0$. This, of course, means that $C_b(S,d) = C_b(S,e)$ and weak convergence of measures does not change. If $(T,e)$ is the completion of $(S,e)$ then $(T,e)$ is compact. By the Arzela-Ascoli theorem, $BL(T,e)$ is separable with respect to the $d_\infty$ norm and, therefore, $BL(S,e)$ is also separable. Let $(f_m)$ be a dense subset of $BL(S,e)$. Then, by the strong law of large numbers,

$$\int f_m \, d\mu_n = \frac{1}{n} \sum_{i=1}^{n} f_m(X_i) \to \mathbb{E} f_m(X_1) = \int f_m \, d\mu \quad \text{a.s.}$$

Therefore, on the set of probability one, $\int f_m \, d\mu_n \to \int f_m \, d\mu$ for all $m \ge 1$. Since $(f_m)$ is dense in $BL(S,e)$, on the same set of probability one, $\int f \, d\mu_n \to \int f \, d\mu$ for all $f \in BL(S,e)$. Since $(S,e)$ is separable, the previous theorem implies that $\mu_n \to \mu$ weakly.

□ 